@Aydin-ab
Contributor

@Aydin-ab Aydin-ab commented Jan 6, 2026

Follow-up to this PR (closed for inactivity over the holidays):
#58571

Many files changed, but the main technical content to review is in the README.ipynb notebooks.

For context, the goal is to refactor the current template
https://console.anyscale.com/template-preview/llm_batch_inference
and split it in two: one on text data, the other on vision data.
The two are very similar, but each should read independently as a distinct piece of content 👍

Aydin Abiar added 30 commits November 12, 2025 12:10
Signed-off-by: Aydin Abiar <[email protected]>
Aydin Abiar added 2 commits January 7, 2026 17:25
…of batch size, concurrency, more refs to docs links, refactor quantization and model parallelism section for more readability, add image validation, mention anyscale runtime, pin datasets version

Signed-off-by: Aydin Abiar <[email protected]>
@Aydin-ab
Contributor Author

Aydin-ab commented Jan 8, 2026

@nrghosh
I followed your suggestions, plus the ones in your main comment (changing the prompt task, etc.).
I'll test the new code tomorrow, but if the content looks OK now, let me know 👍 Thanks a lot

@nrghosh
Contributor

nrghosh commented Jan 8, 2026

/gemini review

Contributor

@nrghosh nrghosh left a comment

Thanks @Aydin-ab. See also the cursor/gemini comments on the code. As long as you can run the examples successfully, they should be free of serious bugs now.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces new LLM batch inference examples for both text and vision data, along with their corresponding CI configurations and helper scripts. The changes effectively split the existing template into two distinct, independently readable content pieces, which is a good refactoring step. The new examples demonstrate how to use Ray Data LLM APIs for batch inference, including data preparation, processor configuration, and scaling considerations. The addition of CI scripts ensures these examples remain functional.

However, there are a few areas that could be improved for robustness and clarity:

  • The nb2py.py scripts use specific string matching to modify dataset limits for CI. This approach is brittle and could break if the exact string in the notebook changes.
  • Some comments in the Jupyter notebooks are slightly misleading regarding dataset size limits.
  • The standalone Python scripts contain hardcoded configuration values that would ideally be configurable for real-world use cases.
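
One way the first and third points could be addressed, sketched under assumptions: read the dataset limit from an environment variable so CI can shrink the run without rewriting notebook strings. The names `DATASET_LIMIT`, `DEFAULT_LIMIT`, and `resolve_dataset_limit` are hypothetical and do not come from the PR, and the default value is a placeholder.

```python
import os

# Hypothetical default; the PR's actual limit is not shown in this thread.
DEFAULT_LIMIT = 10_000


def resolve_dataset_limit(env=None, default=DEFAULT_LIMIT):
    """Return the row limit, preferring the DATASET_LIMIT variable when set."""
    env = os.environ if env is None else env
    raw = env.get("DATASET_LIMIT")
    if raw is None or raw.strip() == "":
        # Nothing configured: fall back to the script's default.
        return default
    limit = int(raw)
    if limit <= 0:
        raise ValueError(f"DATASET_LIMIT must be positive, got {limit}")
    return limit
```

A script could then call something like `ds.limit(resolve_dataset_limit())`, and CI would only need to export a small `DATASET_LIMIT` instead of patching the converted notebook.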

Aydin Abiar added 3 commits January 8, 2026 11:26
Aydin Abiar added 4 commits January 8, 2026 13:43
Signed-off-by: Aydin Abiar <[email protected]>
@Aydin-ab Aydin-ab added the "go" label (add ONLY when ready to merge, run all tests) Jan 12, 2026
@Aydin-ab Aydin-ab requested a review from nrghosh January 12, 2026 22:59
Contributor

@nrghosh nrghosh left a comment

thanks! minor comments but overall LGTM

  • The datasets version pinned in the vision template (4.4.2) differs from the one in job.yaml (4.4.1)
  • Style: put imports at the top of the Python files rather than inline
  • Potential image mode handling issue in batch_inference_vision.py and batch_inference_vision_scaled.py

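When you open images with PIL, non-RGB modes can cause issues. Something like

  from io import BytesIO
  from PIL import Image

  image = Image.open(BytesIO(image))  # `image` starts as the raw encoded bytes
  if image.mode != 'RGB':
      image = image.convert('RGB')  # normalize e.g. RGBA, L, or P mode images

could improve robustness.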

  • Partition counts are hardcoded (64/128/256), but the explanation says "2-4x the worker (GPU) count". With concurrency=4 and 128 partitions, that's 32x the GPU count.
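
To make the arithmetic concrete, here is a minimal sketch that derives the partition count from the concurrency instead of hardcoding it. It assumes one GPU per worker, so concurrency equals the GPU count; the helper names `partitions_per_worker` and `partition_range` are hypothetical and not part of the PR.

```python
def partitions_per_worker(num_partitions: int, concurrency: int) -> float:
    """Ratio of partitions to workers (assumes one GPU per worker)."""
    if concurrency <= 0:
        raise ValueError("concurrency must be positive")
    return num_partitions / concurrency


def partition_range(concurrency: int, low: int = 2, high: int = 4) -> tuple:
    """Partition counts matching the stated 2-4x-per-worker guidance."""
    if concurrency <= 0:
        raise ValueError("concurrency must be positive")
    return concurrency * low, concurrency * high


if __name__ == "__main__":
    # Compare the hardcoded counts against the guidance for concurrency=4.
    for n in (64, 128, 256):
        print(n, "partitions ->", partitions_per_worker(n, 4), "x per worker")
    print("2-4x guidance for concurrency=4:", partition_range(4))
```

With concurrency=4, the guidance works out to 8 to 16 partitions, while 128 partitions gives the 32x ratio noted above. Either the counts or the wording of the explanation would need to change for the two to agree.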

Overall great: extensive use of the APIs, a clear and helpful distinction between batch and online inference, well-written scaling guidance, and bits on model parallelism.

Contributor

@nrghosh nrghosh left a comment

approved with comments

Labels

  • data: Ray Data-related issues
  • docs: An issue or change related to documentation
  • go: add ONLY when ready to merge, run all tests

2 participants